Words containing ,:
Rank in Wordlist | Frequency | Word |
---|---|---|
2998 | 167 | 1,5 |
4782 | 100 | 2,5 |
4954 | 96 | 1,2 |
6241 | 74 | 0,2 |
6637 | 69 | 4,5 |
7132 | 64 | 0,5 |
7133 | 64 | 1,4 |
7237 | 63 | 1,3 |
7435 | 61 | 0,1 |
7436 | 61 | 1,1 |

Words containing %:
Rank in Wordlist | Frequency | Word |
---|---|---|
7238 | 63 | 10% |
7325 | 62 | 100% |
7568 | 60 | 50% |
8576 | 52 | 30% |
9571 | 46 | 5% |
10160 | 43 | 20% |
11095 | 39 | 1% |
12171 | 35 | 7% |
12174 | 35 | 90% |
12494 | 34 | 40% |

Words containing &:
Rank in Wordlist | Frequency | Word |
---|---|---|
4062 | 120 | S&P |
10669 | 41 | Pulp&Paper |
26772 | 14 | Pulp&Paperi |
28523 | 13 | Valga-Valka/Maks&Moorits |
40892 | 8 | A&O |
45655 | 7 | I&T |
46203 | 7 | PG&E |
51170 | 6 | H&M |
57826 | 5 | Anne&Stiil |
68498 | 4 | H&Mi |

Words containing $:
Rank in Wordlist | Frequency | Word |
---|---|---|
82231 | 3 | A$AP |

Words containing ":
Rank in Wordlist | Frequency | Word |
---|---|---|
40 | 7259 | ." |

Words containing ':
Rank in Wordlist | Frequency | Word |
---|---|---|
8752 | 51 | Google'i |
9071 | 49 | Apple'i |
9279 | 48 | Neuville'i |
16947 | 24 | Assange'i |
17620 | 23 | Meeke'i |
18326 | 22 | Neuville'ile |
18371 | 22 | TransferWise'i |
22687 | 17 | France'i |
22778 | 17 | League'i |
23945 | 16 | Line'i |

Words containing +:
Rank in Wordlist | Frequency | Word |
---|---|---|
21565 | 18 | 2+2 |
31991 | 11 | 2+1 |
45128 | 7 | 1+1 |
81565 | 3 | 0+7 |
81922 | 3 | 3+1 |
82209 | 3 | 90+2 |
82210 | 3 | 90+5 |
85759 | 3 | M+S |
105706 | 2 | 0+12 |
105707 | 2 | 0+5 |

Words containing *:
Rank in Wordlist | Frequency | Word |
---|---|---|
182214 | 1 | I Wear* Experiment |

Words containing /:
Rank in Wordlist | Frequency | Word |
---|---|---|
2095 | 234 | Kalev/Cramo |
5222 | 91 | Kalev/TLÜ |
6477 | 71 | Kehra/Horizon |
8975 | 50 | km/h |
9176 | 49 | m/s |
12504 | 34 | Aruküla/Audentes |
14348 | 29 | 2/3 |
14518 | 29 | ja/või |
15259 | 27 | 1/3 |
16331 | 25 | 1/2 |
In the last subsection of this type we look for words containing other special characters: , ( ) % & $ " ' + * = / _
Depending on the language, some of these characters may be allowed within words, while others are not. If words containing forbidden characters do not have a very low frequency, this may indicate a problem in preprocessing.
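The check described above can be sketched in a few lines of Python. This is only an illustration: the function name `flag_suspicious`, the set of allowed characters, and the frequency threshold of 10 are assumptions made for the example, not values taken from the corpus tooling. The sample rows are copied from the tables in this section.

```python
# Special characters examined in this subsection.
SPECIAL_CHARS = set(",()%&$\"'+*=/_")

def flag_suspicious(rows, allowed_chars, min_freq):
    """Return rows whose word contains a non-allowed special character
    and whose frequency is at least min_freq (hypothetical threshold)."""
    forbidden = SPECIAL_CHARS - set(allowed_chars)
    flagged = []
    for rank, freq, word in rows:
        hits = sorted(forbidden.intersection(word))
        if hits and freq >= min_freq:
            flagged.append((rank, freq, word, hits))
    return flagged

# (rank, frequency, word) rows taken from the tables in this section.
rows = [
    (40, 7259, '."'),
    (2998, 167, "1,5"),
    (8752, 51, "Google'i"),
    (82231, 3, "A$AP"),
]

# Assumption for illustration: comma and apostrophe are allowed within words.
flagged = flag_suspicious(rows, allowed_chars=",'", min_freq=10)
for row in flagged:
    print(row)
```

With these assumed settings only the high-frequency token `."` is reported, which is the kind of entry that points to a tokenization problem rather than a genuine word.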
Words containing %:
select w_id-100,freq, word from words where w_id>100 and word like "%\%%" limit 10;
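The same query can be repeated for each of the other characters. The sketch below does this in Python with an in-memory SQLite database standing in for the original one, so the `ESCAPE` clause is spelled out explicitly. The helper names `like_pattern` and `words_containing` are hypothetical, the `ORDER BY w_id` clause is an addition to make the output order deterministic, and the table layout `words(w_id, freq, word)` is assumed from the query above. The offset of 100 simply mirrors that query.

```python
import sqlite3

def like_pattern(ch):
    """Build a LIKE pattern matching words that contain ch.
    % and _ are LIKE wildcards, so they (and the escape character) are escaped."""
    if ch in "%_\\":
        ch = "\\" + ch
    return "%" + ch + "%"

def words_containing(conn, ch, offset=100, limit=10):
    """Return (rank, freq, word) for words containing ch, skipping ids <= offset."""
    sql = (
        "SELECT w_id - ?, freq, word FROM words "
        "WHERE w_id > ? AND word LIKE ? ESCAPE '\\' "
        "ORDER BY w_id LIMIT ?"
    )
    return conn.execute(sql, (offset, offset, like_pattern(ch), limit)).fetchall()

# Small demo table; w_id values are the ranks from this section plus 100.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (w_id INTEGER PRIMARY KEY, freq INTEGER, word TEXT)")
conn.executemany(
    "INSERT INTO words VALUES (?, ?, ?)",
    [
        (50, 9999, "%"),      # hypothetical entry with w_id <= 100, skipped by the query
        (7338, 63, "10%"),
        (7425, 62, "100%"),
        (9075, 50, "km/h"),
        (82331, 3, "A$AP"),
    ],
)

percent_rows = words_containing(conn, "%")
underscore_rows = words_containing(conn, "_")
print(percent_rows)
print(underscore_rows)
```

Escaping matters most for `%` and `_`: an unescaped `_` in a LIKE pattern matches any single character, so the query would return almost every word instead of only those containing an underscore.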